Differential privacy high-dimensional data publishing method via clustering analysis
CHEN Hengheng, NI Zhiwei, ZHU Xuhui, JIN Yuanyuan, CHEN Qian
Journal of Computer Applications    2021, 41 (9): 2578-2585.   DOI: 10.11772/j.issn.1001-9081.2020111786
To address the difficulty that existing differentially private high-dimensional data publishing methods have in balancing complex attribute correlations against computational cost, a differentially private high-dimensional data publishing method based on clustering analysis, named PrivBC, was proposed. Firstly, an attribute clustering method was designed based on K-means++: the maximal information coefficient was introduced to quantify the correlation between attributes, and highly correlated attributes were clustered together. Secondly, for each data subset obtained by the clustering, the correlation matrix was calculated to reduce the candidate space of attribute pairs, and a Bayesian network satisfying differential privacy was constructed. Finally, each attribute was sampled according to the Bayesian networks, and a new private dataset was synthesized for publishing. Compared with the PrivBayes method, PrivBC reduced the misclassification rate and running time by 12.6% and 30.2% on average, respectively. Experimental results show that the proposed method can significantly improve computational efficiency while preserving data utility, and provides a new approach to the private publishing of high-dimensional big data.
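The final two steps of the abstract (a differentially private network edge, then sampling from it) can be illustrated for a single parent-child attribute pair. The sketch below is not the PrivBC implementation. It assumes the Laplace mechanism on a contingency table, add-or-remove-one-record neighbouring datasets (L1 sensitivity 1), and the whole budget `epsilon` spent on one table. The function names `laplace`, `noisy_joint`, `pick`, and `synthesize` are hypothetical, and the clustering and structure-learning steps are omitted.

```python
import random
from collections import Counter


def laplace(scale, rng):
    """Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_joint(rows, parent, child, parent_vals, child_vals, epsilon, rng):
    """Noisy contingency table for one (parent, child) attribute pair.

    Assumption: neighbouring datasets differ by adding or removing one record,
    so the count histogram has L1 sensitivity 1 and the noise scale is 1/epsilon.
    """
    counts = Counter((r[parent], r[child]) for r in rows)
    return {
        # Clipping at zero is post-processing and does not weaken the guarantee.
        (p, c): max(0.0, counts[(p, c)] + laplace(1.0 / epsilon, rng))
        for p in parent_vals
        for c in child_vals
    }


def pick(values, weights, rng):
    """Weighted draw that falls back to uniform if every weight is zero."""
    if sum(weights) <= 0.0:
        return rng.choice(values)
    return rng.choices(values, weights=weights, k=1)[0]


def synthesize(joint, parent_vals, child_vals, n, rng):
    """Sample n synthetic (parent, child) records using only the noisy table."""
    parent_weights = [sum(joint[(p, c)] for c in child_vals) for p in parent_vals]
    records = []
    for _ in range(n):
        p = pick(parent_vals, parent_weights, rng)  # noisy marginal of the parent
        c = pick(child_vals, [joint[(p, c)] for c in child_vals], rng)  # noisy conditional
        records.append((p, c))
    return records


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(0, 0)] * 200 + [(1, 1)] * 200  # toy, perfectly correlated attributes
    table = noisy_joint(data, 0, 1, [0, 1], [0, 1], epsilon=1.0, rng=rng)
    print(synthesize(table, [0, 1], [0, 1], n=5, rng=rng))
```

Because the synthetic records are drawn only from the noisy table, the sampling step is post-processing and consumes no additional privacy budget. A full Bayesian network would repeat this per edge and split `epsilon` across the tables.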